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Abstract 

We investigate numerically the magnetic properties of the 3D Isotropic 
and Anisotropic Hubbard model at half-filling. The behavior of the tran- 
sition temperature as a function of the anisotropic hopping parameter is 
qualitatively described. In the Isotropic model we measure the scaling 
properties of the susceptibility finding agreement with the magnetic criti- 
cal exponents of the 3D Heisenberg model. We also describe several par- 
ticularities concerning the implementation of our simulation in a cluster 
of personal computers. 



1 Introduction 



Many electron systems have been for many years the playground for testing 
models of superconductivity. In this framework the Hubbard model BJ is ex- 
pected to reproduce the essential features that purely electronic degrees of free- 
dom are accountable for. In the Hubbard Hamiltonian thermal agitation is 
modeled via an inter-orbital hopping term, while electrostatic interaction is 
taken into account via an effective Coulomb coupling proportional to the charge 
density. 

The simplicity of this formulation is only apparent. The Hubbard model has 
(partial) exact solutions only in one dimension. In two or more dimensions only 
approximate techniques can be applied. Ground state properties and approx- 
imate phase diagrams have been analytically derived in the limits U — > and 
U — » oo by using Random Phase Approximation and Strong Coupling expan- 
sions respectively. It is clear that the range of validity of analytical approaches 
is always an issue, moreover we are interested in the physics of the system 
at the (more physical) intermediate values of the coupling. In those regions 
where strong and weak coupling expansions break down Quantum Monte Carlo 
(QMC) techniques are useful, not only for testing the validity of other analyti- 
cal methods such as Mean Field, but as a powerful tool to obtain first-principle 
results. 

The numerical investigation of the Hubbard model is however very costly. Away 
from half-filled bands the path integral measure is no longer positive-definite 
and Monte Carlo averages of physical observables suffer from large fluctuations. 
Extracting meaningful results requires extremely large statistics. Even in half- 
filled bands the complexity of the numerical simulation is high, as we shall 
discuss throughout the paper. One has to keep in mind that we are dealing 
with fermionic particles in a region of parameter space in which the Coulomb 
interaction is big enough for the fermion propagator to be close to singular. 
Inverting the propagator is a computationally very expensive operation. The 
most popular algorithm [Q], the determinantal method, has a computational 
complexity which grows with the cube of the system size. Therefore while the 
system has been studied extensively in two dimensions, in three dimensions the 
information we have is more restricted. 

We decided to investigate in this work the magnetic properties of the three 
dimensional Hubbard model at half-filling around the Neel phase transition. 
We concentrate on two issues: the transition temperature, and the universal- 
ity class of the transition. Pioneering works |5| on the phase diagram were 
carried out in the 1980s on small lattices. The model presents a phase transi- 
tion line in the plane (3 — U separating a region in which the system is in a 
disordered paramagnetic phase from another region in which the electron spins 
are aligned in a staggered way. The actual value of the transition temperature 
as a function of U remains an open problem. Recent numerical simulations § 
have shown that the phase transition takes place at temperatures much lower 
than expected from earlier works. Another question to be discerned numeri- 
cally is the universality class of the phase transition line. Close to the Quantum 
Critical Point at T = the transition should be mean field like (4D Heisenberg 
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exponents). Everywhere else it is expected to be in the universality class of the 
three dimensional quantum Heisenberg antiferromagnet Q. 
In recent years the development of clusters of commercial processors has boosted 
the computing capabilities at research institutes. This low weight approach to 
high speed computing is becoming a real alternative to more conventional ap- 
proaches based on parallel supercomputers traditionally developed by industry. 
While price is an obvious advantage in favor of hand made clusters of PCs, 
effectiveness is both a model and algorithm dependent issue. 
Obvious cases for massive cluster simulation are problems that, while not re- 
quiring a huge amount of memory, do require the simulation of many copies of 
the same program. The speeding up is achieved by setting up different starting 
conditions for the different copies of the program. On single processors, high 
performances can be achieved by using the capabilities of recently developed 
processors, such as vectorization ||. In this situation the so-called farm method 
on clusters of PCs is a cost-effective, relatively easy to use, source of computer 
power. The previous statement is only partially true in our case. It is not ob- 
vious that the farm method is the best solution for the Hubbard model. Warm 
up times in large lattices might be very large and to have thermalization effects 
under control it is desirable to have a long single Monte Carlo history. 
These and related questions will be addressed here. Simulations have been 
carried out on the PC cluster at the Center for Data Intensive Computing in 
Brookhaven National Laboratory. It consists of about 150 Pentium III proces- 
sors. Clock speeds range from 500 MHz to 1GHz, and the available DRAM is 

1 Gbyte per processor. Communication between the processors is achieved via 
a commercial Fast Ethernet switch which provides a one-way bandwidth per 
channel of up to 100 Mbits/sg. 

The paper is organized as follows: we first briefly review the model and al- 
gorithm in section 2 in order to fix our notations; the numerical simulation, 
observables and dynamics of the Monte Carlo process is discussed in section 
3. Our results for the Isotropic and Anisotropic Hubbard model are described 
in sections 4 and 5 respectively. Section 6 contains conclusions and comments 
about future perspectives. We left for the appendix some particularities related 
to our implementation of the simulation in a cluster of PCs. 

2 Model and Numerical Algorithm 

In the following we summarize the standard procedure to perform Monte Carlo 
simulations on the Hubbard Model [||, ||, ||] and fix our notation. Consider the 
Hubbard Hamiltonian at half-filling 

H=-t£ (4»°ja + 4*^) + U J> i+ - 1/2) (n ; _ - 1/2) = K + V . (1) 
(ij>,a i 

Here the index i = 0, . . . , V = L D labels the sites of a lattice of side L in D 
spatial dimensions on which periodic boundary conditions have been imposed; 
c\ a and c- ia are respectively the creation and annihilation operators for electrons 
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with a z-component of spin a at the site i; n; a = c\ a Ci a denotes the usual number 
operator. The sum (i j) is over all pairs of nearest neighbors on the lattice. The 
first term models the thermal agitation, whose strength is characterized by the 
hopping parameter t; the second term corresponds to an electrostatic Coulomb 
repulsion of intensity U. 

The path-integral representation is obtained by introducing an imaginary time 
coordinate r, and considering the action of all possible configurations of the 
fields between r = and r = (3. The partition function of the equivalent 
statistical model reads 

Z = Tr(e- § ) =Tr(e^ ft ) . (2) 

In order to perform a numerical simulation the theory is defined on a lattice 
in space and time dimensions. The partition function of the discretized theory 
can be written as 

Z = TV (e- A -K-ArV )Nt _ (3) 

where we defined j3 = At Nt, with At lattice spacing in the temporal direction 
and Nt number of time slices. The role of (5 is analogous to the inverse of the 
temperature T of a classical statistical model in D + 1 dimensions. 
In order to have a well defined relative probability for each configuration in 
phase space, fermions must be integrated out analytically in the partition func- 
tion (H). In a first step the kinetic and potential terms are separated in the 
partition function by the splitting 

Z = Tr (e" ArI V Ar ^) Nt +C(Ar 2 [K,V]) . (4) 

We will neglect the contributions coming from the second term on the r.h.s 
of eq. (0j). The leading order in the error introduced by the so-called Trotter 
aproximation is proportional to the square of At. This systematic error has 
to be kept under control in actual numerical simulations by choosing At small 
enough (i.e. smaller than the statistical error). 

The kinetic part of the Hamiltonian is a quadratic form in fermion fields, cal- 
culating the trace is therefore trivial. In order to write the interaction term as 
a quadratic form as well one introduces a set of auxiliary boson fields [ pX| ] 

e -ArU(n i+ -l/2)(ni_-l/2) _ e~ ArU / 4 ^ e _Ar CTi (l)A(n i+ -n i _) ^ ^ 

<Ti(l)=±l 

where {c}(l) denotes an Ising field defined in the spatial lattice at the time slice 
1 = 1, . . . ,N t . The constant A is related to the parameters of the Hamiltonian 
by the equation cos1i(At A) = exp(ATU/2) for positive A. 
After all these manipulations it is possible to perform the trace in (^|) yielding 

Z = det M+ det M~ , (6) 

Mi)} 

with 

M a = I + B^ t B^ t _ 1 . . . B? = 1 + A Q (N t ) . (7) 
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where we have defined the matrices 



= e TAAra5 ijCTi (l) e -ArK ^ 

A«(l) = BfBf_ 1 ---B?B^---Bf +1 . (9) 

Summarizing, we have substituted the local fermionic interaction in the par- 
tition function by an Ising model with a complicated multi-spin interaction. 
The remaining sum over the Ising field configurations in (^) can be computed 
by standard Monte Carlo techniques. We adopt here the approach proposed 
by Blanckenbecler and coworkers ||, extended to the Hubbard model at low 
temperatures in [||, [|. 

We refer to the above cited works for a discussion on the details of the algorithm. 
In essence the update mechanism is based on the updating of the equal-time 
Green function for an electron of spin a propagating through the field created 
bya(l) 

G a (l)ij = (T[c iQ (lAr)c] Q (lAr)]) = [1 +A*(l)]y 1 , (10) 
with T denoting the temporal ordering operator. 

The Green function turns out to be the fundamental object of the simulation 
since it contains the information needed to update the field cr(l). Besides, along 
the simulation, observables like the energies and the local magnetic moment are 
calculated as expectation values of certain matrix elements of G(l). 
The computation of the Green function is unfortunately also the most expensive 
part of the algorithm in terms of computing time. The numerical evaluation 
of eq. (|T^ ) requires performing N t multiplications of matrices of dimension 
V. That requires order N t xV 3 operations plus the inversion of the resulting 
matrix, which takes of order V 3 operations. 

Timings get worse at low temperature since the matrices Bj get more and more 
ill-conditioned when increasing (3. The computation of the product in eq. (|Io|) 
is then plagued with round-off errors. Obtaining a meaningful result requires 
intermediate re-orthogonalizations in order to isolate the divergent scales in the 
matrix product |J. For very large values of the Green function cannot even 
be calculated in a computer due to finite precision problems. 
The situation can be partially alleviated by realizing that eq. ( |l0|) immediately 
implies 

G Q (1 + 1) = B Q (1 + 1)G Q (1)B Q ~ 1 (1 + 1) , (11) 

which can be used to "advance" the Green function from time slice 1 to 1+ 1. 
The significant reduction in number of operations, comes at the price of in- 
creasing the round-off errors. Due to this fact eq. ( J1~T| ) can be used a limited 
number of consecutive times, say till round-off errors become of the order of 
the statistical ones. One then has to recompute G(l) according to eq. (|lC|). 
For the reasons discussed above, at low temperatures, say (3 > 6, the range of 



applicability of eq. (11) is very limited. 

It is clear at this point that to accelerate the simulation we have to concentrate 
efforts in speeding up matrix operations, in particular the matrix multiplication. 
We address this point on more quantitative grounds in the Appendix, where 
our implementation of the algorithm in a cluster is also discussed. 
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3 The Monte Carlo Simulation 



We have run numerical simulations on the general anisotropic Hubbard model 
in d = 3 

h = -t ( c L c j« + 4« CiQ ) - tz Yl ( c ^ c j« + c J« Ci °) 

(ij>,a 

+U £> i+ - 1/2) (n ; _ - 1/2) . (12) 

i 

In this notation tz and t represent respectively the inter-planar and the in-plane 
hopping parameters. Varying tz the system undergoes a crossover between the 
purely two dimensional behavior at tz = and the three dimensional isotropic 
case at t = tz. Intermediate values of tz model situations in which the material 
is better represented by a weakly coupled set of two dimensional layers, than 
by a three dimensional isotropic lattice. 

The phase diagram of the Hubbard model in d = 3 contains a phase transition 
line in the plane (3 — U. The high temperature phase is paramagnetic while 
in the low temperature region the ground state is an antiferromagnet with the 
electron spins oriented staggered wise in all spatial directions. The limiting 
behavior of the model for large U corresponds to the three dimensional quan- 
tum Heisenberg model. At U = the system becomes a gas of non-interacting 
electrons which shows no transition at all. 

Along the numerical simulation we measure the kinetic and the Coulomb ener- 
gies via the expectation value of the following operators 

ek = 4D " X^^+A") > ( 13 ) 
e c = AA-^(n i+ni _) . (14) 

i 

where the index p, = 0, . . . , 5 denotes the six spatial directions and the normal- 
ization factor N = l/(VN t ) accounting for the sum over spatial and temporal 
lattices has been used. We also measure the local magnetic moment defined as 

S 2 =AA^((n i+ - ni _) 2 ). (15) 

Both, energies and magnetic moment, can be expressed as appropriate combi- 
nations of matrix elements of the Green function. 

The Ising variables are coupled to the z-component of the electron spin at each 
site. Taking this into account we construct the order parameter in the following 
way. At each time slice 1 we define 

m ita g = ^E(- 1 ) X+y+Z -^y Z , (16) 

i 

where (x,y,z) are the coordinates of site i. The average over configurations is 
defined by 

MLgH^Ktag) 2 ) • (17) 
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Adding up the contributions of all time slices we get 



^E^tag- (18) 



M s tag 

1 1 

Along the simulation we have set t = 1. In this paper we concentrate on the 
results for the isotropic case t = tz. Some exploratory studies at tz < t have also 
been performed. In all cases we work at fixed U and sweep through (3 looking 
for the value at which the magnetic susceptibility has a maximum. To keep 
systematic errors under control we use always At ~ 0.125. We are interested 
in the scaling properties of the susceptibility at the phase transition in order 
to measure the magnetic critical exponents. For this purpose we concentrate 
most of our statistics at a single value of the Coulomb interaction, U = 6. Here 
we run lattice sizes L = 4, 6, 8, 10 with Monte Carlo times ranging from 10 5 for 
L = 4 to 10 4 for L = 10 at each (3 value. We discard between 10% and 20% of 
the data as thermalization time, depending on the parameter space point and 
the lattice size. The total computing time spent is the equivalent of about 240 
months in a Pentium III processor at 1GHz. 

To assess the statistical quality of our data, following [11] we define the unnor- 
malized autocorrelation function for the observable O 

N-t 

co(t) = w - r -Y, ^-( ) 2 ■ ( 19 ) 

i=l 



as well as the normalized one 

C O (0) 



Pod) = M • (20) 



The integrated autocorrelation time for O, Tq\ can be measured using the 
window method 

^(t] = |+E«)(0 • (21) 
t'=i 



p=3 
P=7 



1U 
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Figure 1: Autocorrelation function of the Staggered Magnetization (a) and of 
the kinetic Energy (b). Note the change in the scale between the two plots. 
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for large enough t, which is in practice selected self-consistently. We use t in 
the range 5r mt , 10r int , and we check that the obtained r int remains stable as 
the window in t is increased. 

In Figure |l| we plot the autocorrelation function in L = 4 for the staggered 
magnetization (left side) and for the kinetic energy (right side). We observe 
the staggered magnetization building up stronger autocorrelations as the phase 
transition is approached. From this point of view the kinetic energy seems to 
be insensitive to changes in the temperature. We conclude from here that the 
order parameter is a better observable to discuss the onset of criticality on the 
model. 



P=2 



0.3 - 




MC time 



Figure 2: Monte Carlo evolution of M stag at j3 = 2 (upper part) and (5 = 5 
(lower part) for a L = 6 lattice at U = 6. 



4 Isotropic Hubbard model 

We have run numerical simulations in lattices ranging from L = 4 to L = 10 to 
investigate the magnetic behavior of the system around the Neel phase tran- 
sition. For this purpose we have measured the order parameter defined in eq. 
(|i~g|) and the staggered magnetic susceptibility 

Xstag = V <(M stag ) 2 ) , (22) 

which is a monotonically increasing function of (5 for so is (M s t ag ) 2 . 
At low (3 values, high temperatures, the system is in a disordered paramagnetic 
phase. The mean value of the order parameter is zero up to corrections pro- 
portional to 1/V. In figure ||, upper part, we plot the MC evolution of M stag 
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at/3 = 2inaL = 6 lattice with U = 6. As temperature decreases, and the 
magnetic phase transition is approached, the value of the order parameter in- 
creases indicating the tendency of the electron spins to organize themselves in 
a staggered way. On the bottom of the same figure the MC evolution of M stag 
at/3 = 5inaL = 6 lattice at the same value of U is plotted. We see the system 
is flipping back and forth between the disordered paramagnetic phase and the 
staggered ordered one. There is a constant factor (1 — e~ AtU ) _1 relating the 
two-point correlation functions expressed in terms of the electron spin with the 
ones expressed in terms of the Ising fields |l0|. Our plots of magnetic variables 
contain already this factor. 
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Figure 3: M stag versus (3 for different values of U in L = 4. 
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Figure 4: Asymptotic value of M stag versus 1/L at U = 6. 

The staggered magnetization is an increasing function of (5 for fixed U. In 
fact, for all the lattice sizes we investigated M s t ag increases with (5 reaching 
a maximum at some point, and developing a plateau afterwards. The onset 
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of such plateaus is related to the maximal value that the magnetization can 
reach in that particular lattice size. In figure |3] we display the results for an 
L = 4 lattice for different values of the Coulomb interaction. The values of the 
observables on the plateaus can be viewed as the asymptotic values in the T = 
limit. 

In figure || we plot those asymptotic values versus the inverse of the lattice 
size. Spin wave theory predicts that the fluctuations giving rise to spin-spin 
correlations decay as 1/L [12]. Our results support this prediction for L > 4 
giving a value for M s t ag in the thermodynamic limit of 0.156(3). Including L = 4 
the best fit is a 1/L 2 extrapolation which leads to a value of M sta g compatible 
with the previous one. We can therefore not give a conclusive answer to this 
issue but we are inclined to prefer the fit without the small lattice because finite 
size effects are likely to be uncontrolled for L = 4. Indeed, the asymptotic value 
of (M s tag) 2 starts to stabilize only from L = 6 on. 

Next, we focus on the critical behavior of the system. We concentrate our largest 
statistics in this particular aspect. Our aim is studying the scaling of the order 
parameter and the susceptibility close to the phase transition temperature, and 
finally extracting the magnetic critical exponents. A first question arising is the 
actual value of the transition temperature. The most recent work the authors 
are aware of || quotes a value T c ~ 0.3 for the Neel transition at U=6. This 
temperature is certainly much lower than the ones reported in the pioneer works 
of the 1980s (see eg. ||). Our purpose is to give an estimation of the critical 
temperature based on the measurements of the order parameter. 
In figure ^ we plot the histograms corresponding to the time evolution of M stag 
for increasing lattice sizes at (3 = 4 (lower plot) and (3 = 5 (upper plot). The 
asymmetry of the distributions at (3 = 4 indicates that we are close to a phase 
transition. However the system is still clearly in the paramagnetic side because 
when increasing the lattice size the peak of the magnetization in the param- 
agnetic region tends to dominate the distribution while the other runs away. 
The behavior is radically different at (3 = 5. The peak corresponding to the 
paramagnetic phase decreases when the lattice size gets bigger, corresponding 
therefore to a finite size effect. Conversely, the peak in the symmetry broken 
phase tends to grow. Summarizing, the system is already in the antiferromag- 
netic Neel phase at (3 = 5 because for increasing lattice size the system stabilizes 
in the staggered phase. 

From the order parameter distributions it is clear that the magnetic transition 
takes place between T = 0.25 and T = 0.20. Our results therefore support the 
ones of Muramatsu and coworkers in the sense that the phase transition occurs 
at a value much lower than traditionally expected. The fact that we measure a 
value even slightly smaller might be related to the use of a different observable. 
In 0| the authors use cumulants of the energy to locate the critical point, while 
our results are based on order parameter measurements. In principle, different 
estimators give slightly different results in finite lattices. 

The behavior of the staggered susceptibility (eq. ^) across the parameter space 
for different lattice sizes is presented in figure ||. We observe the susceptibility 
growing monotonically until reaching a plateau reflecting the behavior of the 
magnetization itself. The saturation of the magnetization, and the subsequent 
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Figure 5: Normalized distribution of M stag at U = 6 at (3 = 4 (lower plot) and 
(3 = 5 (upper plot). The Neel phase transition takes place between this two 
values. 
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Figure 7: Log-log plot of the magnetic susceptibility at (3 = 5, U=6 versus 
lattice size. The linear fit gives a critical exponent j/u = 2.08(9). 



plateau in the susceptibility raises doubts on the interpretation of the peak in 
the susceptibility as a good observable to locate the phase transition. In prin- 
ciple, there is no reason to believe that the (5 value at which the susceptibility 
reaches the plateau has anything to do with the critical temperature of the Neel 
transition. 

We observe that the onset of the plateaus comes close to but clearly after = 5 
in all cases. Taking (3 = 5 as our best estimate for the critical temperature, and 
using the scaling law 

Xstag(T c ) oc V' u , (23) 

our result for the magnetic critical exponent should agree with the one of the 
three dimensional Heisenberg model @, that is, 7/1/ ~ 1.98. Our estimation 
for the magnetic critical exponent is in good agreement with this expectation. 
The result of our linear fit is plotted in figure [/], giving a value of 7/V = 2.08(9). 



5 Anisotropic Hubbard model 

The introduction of an anisotropic hopping parameter tz allows us to interpo- 
late between the purely two dimensional behavior (tz = 0) and the perfectly 
isotropic three dimensional lattice (tz = t). We have done some exploratory 
studies at intermediate values of tz to get some insight on the crossover behav- 
ior of the model. 

In figure || we plot the result for the kinetic energy in D = 2, 3 and at several 
intermediate values of the hopping parameter. The interpolation is smooth in all 
cases. As a function of tz, the interpolation is linear for the high temperature 
case, and faster than linear when the temperature is lowered. This can be 
understood easily taking into account that at low temperatures the correlation 
between planes is higher and compensates the smaller value of the interplanar 
tz hopping coupling. 
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Figure 8: Kinetic energy for different values of tz at U=4. The values of the 
energies interpolate smoothly between the D=3 and D=2 limits. 



In two dimensions a phase transition to an antiferromagnetic ordered phase is 
expected at T = 0. In principle we can ask for the dependence of the phase 
transition temperature on the anisotropic hopping parameter tz. Evidently 
decreasing tz will result in a decrease of the transition temperature. The ac- 
tual dependence T c (tz) is an interesting monitor of the form of the quantum 
fluctuations which disorder the ground state at T = in D = 2. 
As a first step in that direction we have measured the staggered magnetization 
for a small lattice, L = 4, for different values of tz. The order parameter flips 
between the disordered phase and the ordered one at /3 = 6 producing these 
distributions plotted in figure ||. We can observe that as tz is lowered, the system 
is the more and more in the disordered phase, meaning that the transition 
temperature in fact decreases as tz goes to zero. Eventually, the value of the 
critical temperature should go to zero, where the quantum critical point of the 
D = 2 Hubbard model is expected to be. 

6 Summary and conclusions 

We have investigated the properties of the magnetic phase transition in the 
three dimensional Hubbard model. The measurement of the order parameter 
allows us to give an estimation of the critical temperature at U = 6. From the 
scaling of the magnetic susceptibility we compute the magnetic critical exponent 
7/V which is in agreement with the magnetic exponent of the three dimensional 
Heisenberg model. 

The dependence of the phase transition temperature on the anisotropic hopping 
parameter is a very interesting project from the numerical point of view. In 
this area we are aware of results based on Dynamical Mean Field Theory and 
the Two Particle Self Consistent approach fl3|| . It would certainly be of interest 
to probe such results in a Monte Carlo simulation. 
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Figure 9: Distribution of M stag at/3 = 6inaL = 4 lattice at U = 6 for several 
values of tz. 

We have shown that computationally costly thermodynamic quantities such as 
distributions of the order parameter and critical exponents can be computed 
with moderate-large computing resources using a well known algorithm. This 
fact should not dismiss a very important issue, which is the development of bet- 
ter core algorithms. A consequence of the impressive development of computer 
technologies is that we are now able to produce results that we could not have 
dreamed of only 5 years ago. Such technical developments should go hand by 
hand with work in algorithm improvement. 
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Appendix: The Hubbard model on a PC cluster 



The question we try to answer here is how to make optimal use of the PC 
cluster to speed up our numerical investigation of the Hubbard model using the 
determinantal method. As we pointed out in the introduction the farm method 
might not been suitable to simulate large lattices due to thermalization issues. 
The answer to our question is probably that a combination of both, the farm 
method, and a parallel version of the algorithm would do best. 
The most serious problem regarding parallelization techniques applied to this 
algorithm is the extreme non-locality of the update mechanism. Let us suppose 
that we distribute the Ising spin variables among np processors in such a way 
that each processor takes care of the update of a piece of the spatial lattice. It 
is easy to show that such strategy could not work. Consider for instance the 
update of the spin a s (we will omit the time index 1 throughout this section 
for notation clarity) The update probability depends, among other things, on 
the diagonal element of the Green function G° . The crucial point is that if the 
spin a s is flipped all the elements of the Green function change M 



G g — >• f Gy + Gg ■ G°j if j/s 

G£ ~~ * + f (Gg • G° — G^) if j=s , 



(24) 



where f is a constant factor independent of the spatial coordinates. In particular 
the change affects all diagonal elements which enter in the update probability 
of all the other <r's on the the processors. This implies that every time a 
spin is flipped the recomputed Green function should be communicated to all 
processors. Synchronizing such communication is likely to be impossible. In 
any case, from a strictly performance point of view it is clear such an strategy 
could not pay off. The update of the Ising fields within a same time slice must 
therefore be sequential. 

A look at the definition of the Green function (eq. (|i~0P) tells us that the update 
of the different time slices cannot be a distributed task either. The value of the 
Green function at a particular time slice 1, depends on the state of the Ising 
spins on all the other time slices. From the previous analysis we conclude that 
the update process is inherently sequential in all dimensions. 
A possibility to still make use of computing cooperation among several proces- 
sors is to parallelize the matrix operations [14|. We observe in eq. (|24] ) that 
despite all the matrix elements Gy of the Green function change after a spin 
flip, such change can be computed from the original G;- plus a factor which 
only depends on the elements of G a belonging to the row and column of the 
particular site s being updated 
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The parallelization strategy takes profit of this regularity. The matrix elements 
of the different operators are distributed column-wise among the processors. 
When a matrix operation takes place, eg. a matrix multiplication, each pro- 
cessor computes only the part corresponding to the column it is responsible 
for. The update of the field a s is done simultaneously on all the processors. 
In principle, since the random number sequence is the same for all of them, 
the result of the update is the same on all of them. Therefore this mechanism, 
although redundant, is harmless. In the program, before the update function is 
called, the processor containing the column Gf s broadcast this column to all the 
others. As explained before this is the only information needed to recompute 
the Green function during the update process. Note that the portion of the 
s-row needed by each processor is stored locally, and therefore needs not to be 
communicated. 




Figure 10: Speed up of the simulation as a function of the number of processors 
using the parallelization method described in the text. 



In figure |10| we show the speed up in the calculation of the Green function when 
using the proposed strategy. For small and intermediate lattice sizes the farm 
method is still the best option. For big lattices (L > 10) the algorithm starts 
to scale reasonably well. It is clear that there is a bottleneck generated by 
the communication of matrix elements during the calculation. The jam tends 
to improve when the lattice size gets big because the processors have more 
operations to perform and therefore do not block the channels trying to submit 
and retrieve information constantly. 

Summarizing, the lesson to extract from here is that the algorithm is paralleliz- 
able. Using this scheme instead of a farm method pays off for big lattices. It is 
also clear that we have a very modest switch (Fast Ethernet at 100 Mbits/s).The 
performance of the parallelization is probably boosted using a more advanced 
switch for instance a Mirinet™ [15]. 
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